Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 10000 |
| Missing cells | 22209 |
| Missing cells (%) | 12.3% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.4 MiB |
| Average record size in memory | 144.0 B |
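The percentages and sizes in this table can be cross-checked from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that the total size is roughly the average record size times the number of observations. The variable names are illustrative, not part of the report.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 18
n_observations = 10_000
missing_cells = 22_209
avg_record_bytes = 144.0

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size = average record size * number of records.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Both results round to the reported figures, which suggests the table is internally consistent.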
Variable types
| Numeric | 13 |
|---|---|
| Categorical | 5 |
Warnings
| Warning | Type |
|---|---|
| countries has constant value "France" | Constant |
| product_name has a high cardinality: 9453 distinct values | High cardinality |
| brands has a high cardinality: 4223 distinct values | High cardinality |
| ingredients_text has a high cardinality: 7791 distinct values | High cardinality |
| df_index is highly correlated with Unnamed: 0 | High correlation |
| Unnamed: 0 is highly correlated with df_index | High correlation |
| energy_100g is highly correlated with calculated_energy | High correlation |
| calculated_energy is highly correlated with energy_100g | High correlation |
| nutrition_grade_fr is highly correlated with countries | High correlation |
| countries is highly correlated with nutrition_grade_fr | High correlation |
| ingredients_text has 2094 (20.9%) missing values | Missing |
| nutrition_grade_fr has 396 (4.0%) missing values | Missing |
| fat_100g has 1727 (17.3%) missing values | Missing |
| saturated-fat_100g has 248 (2.5%) missing values | Missing |
| carbohydrates_100g has 1784 (17.8%) missing values | Missing |
| sugars_100g has 245 (2.5%) missing values | Missing |
| fiber_100g has 3368 (33.7%) missing values | Missing |
| salt_100g has 238 (2.4%) missing values | Missing |
| nutrition-score-fr_100g has 396 (4.0%) missing values | Missing |
| fruits-vegetables-nuts_100g has 9660 (96.6%) missing values | Missing |
| calculated_energy has 1814 (18.1%) missing values | Missing |
| product_name is uniformly distributed | Uniform |
| ingredients_text is uniformly distributed | Uniform |
| df_index has unique values | Unique |
| Unnamed: 0 has unique values | Unique |
| fat_100g has 729 (7.3%) zeros | Zeros |
| saturated-fat_100g has 1304 (13.0%) zeros | Zeros |
| carbohydrates_100g has 413 (4.1%) zeros | Zeros |
| sugars_100g has 722 (7.2%) zeros | Zeros |
| fiber_100g has 1943 (19.4%) zeros | Zeros |
| proteins_100g has 662 (6.6%) zeros | Zeros |
| salt_100g has 999 (10.0%) zeros | Zeros |
| nutrition-score-fr_100g has 500 (5.0%) zeros | Zeros |
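To illustrate how column-level flags like the ones listed here can be derived, the sketch below labels a single column as Constant, Unique, Missing, or Zeros. It is a simplified stand-in, not the profiling library's actual rules: the function name `column_warnings`, the `missing_threshold` parameter, and the use of `None` for a missing cell are all assumptions made for this example.

```python
def column_warnings(values, missing_threshold=0.0):
    """Return warning labels for one column; None marks a missing cell."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    labels = []

    # Constant: every non-missing cell holds the same value.
    if present and len(set(present)) == 1:
        labels.append("Constant")

    # Unique: no missing cells and no repeated value.
    if present and len(set(present)) == len(present) == len(values):
        labels.append("Unique")

    # Missing: share of missing cells exceeds the (hypothetical) threshold.
    if n_missing and n_missing / len(values) > missing_threshold:
        labels.append("Missing")

    # Zeros: at least one non-missing cell equals zero.
    if any(v == 0 for v in present):
        labels.append("Zeros")

    return labels


print(column_warnings(["France"] * 4))
print(column_warnings([0.0, None, 1.5, 0.0]))
```

A constant column such as countries carries no information for modelling, and a column that is both unique and sequential (as df_index and Unnamed: 0 appear to be) usually behaves like an identifier, so both kinds are common candidates for removal before analysis.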
Reproduction
| Analysis started | 2021-03-23 14:51:43.994731 |
|---|---|
| Analysis finished | 2021-03-23 14:52:15.353422 |
| Duration | 31.36 seconds |
| Software version | pandas-profiling v2.12.0 |
| Download configuration | config.yaml |
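The reported duration can be recomputed from the two timestamps. A small check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2021-03-23 14:51:43.994731")
finished = datetime.fromisoformat("2021-03-23 14:52:15.353422")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```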
| Distinct | 10000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45085.1871 |
| Minimum | 1 |
|---|---|
| Maximum | 91029 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4387.45 |
| Q1 | 22148 |
| median | 44467.5 |
| Q3 | 67997.75 |
| 95-th percentile | 86499.8 |
| Maximum | 91029 |
| Range | 91028 |
| Interquartile range (IQR) | 45849.75 |
Descriptive statistics
| Standard deviation | 26382.55195 |
|---|---|
| Coefficient of variation (CV) | 0.5851711758 |
| Kurtosis | -1.20770192 |
| Mean | 45085.1871 |
| Median Absolute Deviation (MAD) | 22857 |
| Skewness | 0.03169901282 |
| Sum | 450851871 |
| Variance | 696039047.3 |
| Monotonicity | Not monotonic |
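Several of the descriptive statistics for this variable are simple functions of the others. The sketch below recomputes them from the reported values as a consistency check. It assumes the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count), and small differences are expected because the inputs are rounded.

```python
import math

# Values copied from the quantile and descriptive statistics tables.
minimum, maximum = 1, 91029
q1, q3 = 22148, 67997.75
mean, std = 45085.1871, 26382.55195
n = 10_000  # this variable has no missing values

value_range = maximum - minimum   # reported: 91028
iqr = q3 - q1                     # reported: 45849.75
cv = std / mean                   # reported: 0.5851711758
variance = std ** 2               # reported: 696039047.3
total = mean * n                  # reported: 450851871

print(value_range, iqr)
print(math.isclose(cv, 0.5851711758, rel_tol=1e-6))
print(math.isclose(variance, 696039047.3, rel_tol=1e-6))
```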
| Value | Count | Frequency (%) |
| 45259 | 1 | < 0.1% |
| 76480 | 1 | < 0.1% |
| 86711 | 1 | < 0.1% |
| 76472 | 1 | < 0.1% |
| 82455 | 1 | < 0.1% |
| 35765 | 1 | < 0.1% |
| 17085 | 1 | < 0.1% |
| 23230 | 1 | < 0.1% |
| 2528 | 1 | < 0.1% |
| 58049 | 1 | < 0.1% |
| Other values (9990) | 9990 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 4 | 1 | |
| 10 | 1 | |
| 12 | 1 | |
| 20 | 1 |
| Value | Count | Frequency (%) |
| 91029 | 1 | |
| 91009 | 1 | |
| 90982 | 1 | |
| 90958 | 1 | |
| 90952 | 1 |
| Distinct | 10000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 245783.8975 |
| Minimum | 189 |
|---|---|
| Maximum | 355977 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 189 |
|---|---|
| 5-th percentile | 182962.75 |
| Q1 | 212010.5 |
| median | 240782.5 |
| Q3 | 272517.75 |
| 95-th percentile | 338594.5 |
| Maximum | 355977 |
| Range | 355788 |
| Interquartile range (IQR) | 60507.25 |
Descriptive statistics
| Standard deviation | 50659.43284 |
|---|---|
| Coefficient of variation (CV) | 0.2061137176 |
| Kurtosis | 1.777024105 |
| Mean | 245783.8975 |
| Median Absolute Deviation (MAD) | 29976.5 |
| Skewness | -0.2291782459 |
| Sum | 2457838975 |
| Variance | 2566378135 |
| Monotonicity | Not monotonic |
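The Median Absolute Deviation (MAD) row is a robust alternative to the standard deviation. A minimal sketch, assuming the report uses the unscaled definition (the median of absolute deviations from the median); the helper name is hypothetical:

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of |x - median(x)|, without any scaling constant."""
    center = median(values)
    return median(abs(v - center) for v in values)


# One extreme value barely moves the MAD, unlike the standard deviation.
print(median_absolute_deviation([1, 2, 3, 4, 100]))
```

This robustness is why the MAD stays useful for variables with a few extreme entries, such as the very small minimum (189) in the table for this variable.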
| Value | Count | Frequency (%) |
| 268808 | 1 | < 0.1% |
| 273136 | 1 | < 0.1% |
| 207584 | 1 | < 0.1% |
| 255443 | 1 | < 0.1% |
| 285414 | 1 | < 0.1% |
| 264941 | 1 | < 0.1% |
| 211690 | 1 | < 0.1% |
| 242411 | 1 | < 0.1% |
| 252654 | 1 | < 0.1% |
| 336625 | 1 | < 0.1% |
| Other values (9990) | 9990 |
| Value | Count | Frequency (%) |
| 189 | 1 | |
| 194 | 1 | |
| 242 | 1 | |
| 249 | 1 | |
| 462 | 1 |
| Value | Count | Frequency (%) |
| 355977 | 1 | |
| 355876 | 1 | |
| 355811 | 1 | |
| 355732 | 1 | |
| 355719 | 1 |
product_name
| Distinct | 9453 |
|---|---|
| Distinct (%) | 94.8% |
| Missing | 33 |
| Missing (%) | 0.3% |
| Memory size | 78.2 KiB |
| Huile d'olive vierge extra | 10 |
|---|---|
| Coquillettes | 9 |
| Couscous | 7 |
| Limonade | 7 |
| Mozzarella | 6 |
| Other values (9448) |
Length
| Max length | 126 |
|---|---|
| Median length | 24 |
| Mean length | 26.30972208 |
| Min length | 3 |
Characters and Unicode
| Total characters | 262229 |
|---|---|
| Distinct characters | 138 |
| Distinct categories | 17 |
| Distinct scripts | 3 |
| Distinct blocks | 5 |
Unique
| Unique | 9097 |
|---|---|
| Unique (%) | 91.3% |
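"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. The sketch below shows the distinction on a toy list and then checks the reported percentages, assuming both are computed over non-missing rows (10000 − 33). The helper name and the use of `None` for missing cells are illustrative assumptions.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique, non_missing) counts; None marks a missing cell."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    distinct = len(counts)                                # different values
    unique = sum(1 for c in counts.values() if c == 1)    # values seen once
    return distinct, unique, len(present)


print(distinct_and_unique(["a", "b", "a", None, "c"]))

# Cross-check of the reported percentages for this variable.
non_missing = 10_000 - 33
print(f"Distinct: {100 * 9453 / non_missing:.1f}%")
print(f"Unique:   {100 * 9097 / non_missing:.1f}%")
```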
Sample
| 1st row | Easy fruity |
|---|---|
| 2nd row | Légumes Secs Gourmands |
| 3rd row | Raifort doux d'Alsace |
| 4th row | Instant choco |
| 5th row | Le Pur Bœuf 5% de M.G. |
| Value | Count | Frequency (%) |
| Huile d'olive vierge extra | 10 | 0.1% |
| Coquillettes | 9 | 0.1% |
| Couscous | 7 | 0.1% |
| Limonade | 7 | 0.1% |
| Mozzarella | 6 | 0.1% |
| Pois chiches | 6 | 0.1% |
| Corn Flakes | 6 | 0.1% |
| Jus d'ananas | 5 | 0.1% |
| Sirop de grenadine | 5 | 0.1% |
| Gnocchi à poêler | 5 | 0.1% |
| Other values (9443) | 9901 | |
| (Missing) | 33 | 0.3% |
| Value | Count | Frequency (%) |
| de | 2499 | 5.9% |
| (blank) | 1026 | 2.4% |
| au | 843 | 2.0% |
| à | 635 | 1.5% |
| chocolat | 514 | 1.2% |
| et | 500 | 1.2% |
| aux | 466 | 1.1% |
| la | 452 | 1.1% |
| bio | 423 | 1.0% |
| lait | 394 | 0.9% |
| Other values (7147) | 34916 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 32988 | 12.6% |
| e | 26323 | 10.0% |
| a | 18834 | 7.2% |
| i | 15453 | 5.9% |
| r | 14471 | 5.5% |
| o | 14460 | 5.5% |
| s | 13923 | 5.3% |
| t | 13202 | 5.0% |
| u | 11063 | 4.2% |
| n | 11032 | 4.2% |
| Other values (128) | 90480 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 194023 | |
| Space Separator | 32988 | 12.6% |
| Uppercase Letter | 26577 | 10.1% |
| Decimal Number | 3770 | 1.4% |
| Other Punctuation | 3006 | 1.1% |
| Dash Punctuation | 777 | 0.3% |
| Open Punctuation | 489 | 0.2% |
| Close Punctuation | 485 | 0.2% |
| Math Symbol | 76 | < 0.1% |
| Other Symbol | 20 | < 0.1% |
| Other values (7) | 18 | < 0.1% |
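The category names in this table correspond to Unicode general categories. As a rough illustration, Python's standard `unicodedata` module can tally them for a string. This is a stand-in for whatever the profiling tool uses internally, and it reports two-letter codes (`Ll` = Lowercase Letter, `Lu` = Uppercase Letter, `Zs` = Space Separator, `Nd` = Decimal Number, `Po` = Other Punctuation) rather than the long names.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count Unicode general-category codes over all characters in texts."""
    counts = Counter()
    for text in texts:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# One of the sample product names from this report.
sample = ["Le Pur Bœuf 5% de M.G."]
for code, count in category_counts(sample).most_common():
    print(code, count)
```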
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 26323 | |
| a | 18834 | |
| i | 15453 | 8.0% |
| r | 14471 | 7.5% |
| o | 14460 | 7.5% |
| s | 13923 | 7.2% |
| t | 13202 | 6.8% |
| u | 11063 | 5.7% |
| n | 11032 | 5.7% |
| l | 10797 | 5.6% |
| Other values (45) | 44465 |
| Value | Count | Frequency (%) |
| C | 3408 | |
| P | 2702 | 10.2% |
| S | 2203 | 8.3% |
| B | 2094 | 7.9% |
| M | 1932 | 7.3% |
| L | 1487 | 5.6% |
| F | 1446 | 5.4% |
| G | 1356 | 5.1% |
| T | 1263 | 4.8% |
| A | 1214 | 4.6% |
| Other values (25) | 7472 |
| Value | Count | Frequency (%) |
| ' | 954 | |
| , | 810 | |
| % | 440 | |
| & | 365 | 12.1% |
| . | 197 | 6.6% |
| / | 120 | 4.0% |
| ; | 43 | 1.4% |
| ! | 24 | 0.8% |
| : | 22 | 0.7% |
| ? | 16 | 0.5% |
| Other values (4) | 15 | 0.5% |
| Value | Count | Frequency (%) |
| 0 | 965 | |
| 2 | 619 | |
| 1 | 553 | |
| 5 | 405 | |
| 3 | 323 | 8.6% |
| 4 | 315 | 8.4% |
| 8 | 195 | 5.2% |
| 6 | 185 | 4.9% |
| 7 | 149 | 4.0% |
| 9 | 61 | 1.6% |
| Value | Count | Frequency (%) |
| ´ | 5 | |
| ¨ | 1 | 14.3% |
| ` | 1 | 14.3% |
| Value | Count | Frequency (%) |
| ( | 482 | |
| [ | 5 | 1.0% |
| { | 2 | 0.4% |
| Value | Count | Frequency (%) |
| + | 73 | |
| | | 2 | 2.6% |
| = | 1 | 1.3% |
| Value | Count | Frequency (%) |
| | 2 | |
| | 1 | |
| | 1 |
| Value | Count | Frequency (%) |
| ) | 480 | |
| ] | 5 | 1.0% |
| Value | Count | Frequency (%) |
| ° | 12 | |
| ® | 8 |
| Value | Count | Frequency (%) |
| ¢ | 1 | |
| € | 1 |
| Value | Count | Frequency (%) |
| (space) | 32988 |
| Value | Count | Frequency (%) |
| - | 777 |
| Value | Count | Frequency (%) |
| _ | 1 |
| Value | Count | Frequency (%) |
| ’ | 1 |
| Value | Count | Frequency (%) |
| º | 2 |
| Value | Count | Frequency (%) |
| ̀ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 220602 | |
| Common | 41626 | 15.9% |
| Inherited | 1 | < 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 26323 | 11.9% |
| a | 18834 | 8.5% |
| i | 15453 | 7.0% |
| r | 14471 | 6.6% |
| o | 14460 | 6.6% |
| s | 13923 | 6.3% |
| t | 13202 | 6.0% |
| u | 11063 | 5.0% |
| n | 11032 | 5.0% |
| l | 10797 | 4.9% |
| Other values (81) | 71044 |
| Value | Count | Frequency (%) |
| (space) | 32988 | |
| 0 | 965 | 2.3% |
| ' | 954 | 2.3% |
| , | 810 | 1.9% |
| - | 777 | 1.9% |
| 2 | 619 | 1.5% |
| 1 | 553 | 1.3% |
| ( | 482 | 1.2% |
| ) | 480 | 1.2% |
| % | 440 | 1.1% |
| Other values (36) | 2558 | 6.1% |
| Value | Count | Frequency (%) |
| ̀ | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 256023 | |
| None | 6202 | 2.4% |
| Punctuation | 2 | < 0.1% |
| Diacriticals | 1 | < 0.1% |
| Currency Symbols | 1 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| (space) | 32988 | 12.9% |
| e | 26323 | 10.3% |
| a | 18834 | 7.4% |
| i | 15453 | 6.0% |
| r | 14471 | 5.7% |
| o | 14460 | 5.6% |
| s | 13923 | 5.4% |
| t | 13202 | 5.2% |
| u | 11063 | 4.3% |
| n | 11032 | 4.3% |
| Other values (77) | 84274 |
| Value | Count | Frequency (%) |
| é | 3782 | |
| à | 635 | 10.2% |
| è | 538 | 8.7% |
| â | 300 | 4.8% |
| ê | 194 | 3.1% |
| û | 135 | 2.2% |
| ç | 90 | 1.5% |
| ô | 81 | 1.3% |
| œ | 75 | 1.2% |
| É | 72 | 1.2% |
| Other values (37) | 300 | 4.8% |
| Value | Count | Frequency (%) |
| ’ | 1 | |
| … | 1 |
| Value | Count | Frequency (%) |
| ̀ | 1 |
| Value | Count | Frequency (%) |
| € | 1 |
brands
| Distinct | 4223 |
|---|---|
| Distinct (%) | 42.5% |
| Missing | 69 |
| Missing (%) | 0.7% |
| Memory size | 78.2 KiB |
| Carrefour | 317 |
|---|---|
| Auchan | 316 |
| U | 250 |
| Casino | 203 |
| Leader Price | 192 |
| Other values (4218) |
Length
| Max length | 90 |
|---|---|
| Median length | 9 |
| Mean length | 11.29543853 |
| Min length | 1 |
Characters and Unicode
| Total characters | 112175 |
|---|---|
| Distinct characters | 106 |
| Distinct categories | 11 |
| Distinct scripts | 3 |
| Distinct blocks | 4 |
Unique
| Unique | 3145 |
|---|---|
| Unique (%) | 31.7% |
Sample
| 1st row | Carrefour Kids |
|---|---|
| 2nd row | Tipiak |
| 3rd row | Alélor |
| 4th row | Naturella |
| 5th row | Charal |
| Value | Count | Frequency (%) |
| Carrefour | 317 | 3.2% |
| Auchan | 316 | 3.2% |
| U | 250 | 2.5% |
| Casino | 203 | 2.0% |
| Leader Price | 192 | 1.9% |
| Picard | 127 | 1.3% |
| Cora | 108 | 1.1% |
| Monoprix | 81 | 0.8% |
| Franprix | 72 | 0.7% |
| Fleury Michon | 68 | 0.7% |
| Other values (4213) | 8197 | |
| (Missing) | 69 | 0.7% |
| Value | Count | Frequency (%) |
| carrefour | 438 | 2.7% |
| auchan | 359 | 2.2% |
| u | 310 | 1.9% |
| casino | 270 | 1.6% |
| la | 267 | 1.6% |
| de | 247 | 1.5% |
| leader | 230 | 1.4% |
| repère | 213 | 1.3% |
| price | 205 | 1.2% |
| bio | 203 | 1.2% |
| Other values (4405) | 13693 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 11393 | 10.2% |
| a | 9608 | 8.6% |
| r | 9276 | 8.3% |
| i | 7904 | 7.0% |
| o | 6880 | 6.1% |
| (space) | 6507 | 5.8% |
| n | 6194 | 5.5% |
| u | 4441 | 4.0% |
| s | 4377 | 3.9% |
| t | 4305 | 3.8% |
| Other values (96) | 41290 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 86120 | |
| Uppercase Letter | 16845 | 15.0% |
| Space Separator | 6507 | 5.8% |
| Other Punctuation | 2412 | 2.2% |
| Decimal Number | 81 | 0.1% |
| Dash Punctuation | 67 | 0.1% |
| Open Punctuation | 58 | 0.1% |
| Close Punctuation | 58 | 0.1% |
| Math Symbol | 23 | < 0.1% |
| Final Punctuation | 2 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 11393 | |
| a | 9608 | |
| r | 9276 | |
| i | 7904 | |
| o | 6880 | 8.0% |
| n | 6194 | 7.2% |
| u | 4441 | 5.2% |
| s | 4377 | 5.1% |
| t | 4305 | 5.0% |
| l | 4238 | 4.9% |
| Other values (35) | 17504 |
| Value | Count | Frequency (%) |
| C | 2009 | |
| M | 1658 | 9.8% |
| L | 1573 | 9.3% |
| P | 1296 | 7.7% |
| B | 1229 | 7.3% |
| A | 1054 | 6.3% |
| S | 977 | 5.8% |
| D | 844 | 5.0% |
| F | 773 | 4.6% |
| R | 679 | 4.0% |
| Other values (23) | 4753 |
| Value | Count | Frequency (%) |
| , | 1587 | |
| ' | 516 | 21.4% |
| & | 142 | 5.9% |
| . | 94 | 3.9% |
| ! | 64 | 2.7% |
| ? | 3 | 0.1% |
| ; | 2 | 0.1% |
| / | 2 | 0.1% |
| % | 1 | < 0.1% |
| : | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 3 | 16 | |
| 1 | 13 | |
| 2 | 9 | |
| 7 | 9 | |
| 4 | 9 | |
| 0 | 8 | |
| 9 | 6 | 7.4% |
| 5 | 5 | 6.2% |
| 6 | 3 | 3.7% |
| 8 | 3 | 3.7% |
| Value | Count | Frequency (%) |
| 谷 | 1 | |
| 优 | 1 |
| Value | Count | Frequency (%) |
| (space) | 6507 |
| Value | Count | Frequency (%) |
| - | 67 |
| Value | Count | Frequency (%) |
| + | 23 |
| Value | Count | Frequency (%) |
| ( | 58 |
| Value | Count | Frequency (%) |
| ) | 58 |
| Value | Count | Frequency (%) |
| ’ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 102965 | |
| Common | 9208 | 8.2% |
| Han | 2 | < 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 11393 | 11.1% |
| a | 9608 | 9.3% |
| r | 9276 | 9.0% |
| i | 7904 | 7.7% |
| o | 6880 | 6.7% |
| n | 6194 | 6.0% |
| u | 4441 | 4.3% |
| s | 4377 | 4.3% |
| t | 4305 | 4.2% |
| l | 4238 | 4.1% |
| Other values (68) | 34349 |
| Value | Count | Frequency (%) |
| (space) | 6507 | |
| , | 1587 | 17.2% |
| ' | 516 | 5.6% |
| & | 142 | 1.5% |
| . | 94 | 1.0% |
| - | 67 | 0.7% |
| ! | 64 | 0.7% |
| ( | 58 | 0.6% |
| ) | 58 | 0.6% |
| + | 23 | 0.2% |
| Other values (16) | 92 | 1.0% |
| Value | Count | Frequency (%) |
| 谷 | 1 | |
| 优 | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 110631 | |
| None | 1540 | 1.4% |
| Punctuation | 2 | < 0.1% |
| CJK | 2 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 11393 | 10.3% |
| a | 9608 | 8.7% |
| r | 9276 | 8.4% |
| i | 7904 | 7.1% |
| o | 6880 | 6.2% |
| (space) | 6507 | 5.9% |
| n | 6194 | 5.6% |
| u | 4441 | 4.0% |
| s | 4377 | 4.0% |
| t | 4305 | 3.9% |
| Other values (67) | 39746 |
| Value | Count | Frequency (%) |
| é | 935 | |
| è | 379 | |
| ô | 47 | 3.1% |
| â | 43 | 2.8% |
| î | 21 | 1.4% |
| ê | 21 | 1.4% |
| É | 18 | 1.2% |
| ä | 13 | 0.8% |
| ü | 9 | 0.6% |
| ï | 8 | 0.5% |
| Other values (16) | 46 | 3.0% |
| Value | Count | Frequency (%) |
| ’ | 2 |
| Value | Count | Frequency (%) |
| 谷 | 1 | |
| 优 | 1 |
countries
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| France |
|---|
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 60000 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | France |
|---|---|
| 2nd row | France |
| 3rd row | France |
| 4th row | France |
| 5th row | France |
| Value | Count | Frequency (%) |
| France | 10000 |
| Value | Count | Frequency (%) |
| france | 10000 |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 10000 | |
| r | 10000 | |
| a | 10000 | |
| n | 10000 | |
| c | 10000 | |
| e | 10000 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 50000 | |
| Uppercase Letter | 10000 | 16.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| r | 10000 | |
| a | 10000 | |
| n | 10000 | |
| c | 10000 | |
| e | 10000 |
| Value | Count | Frequency (%) |
| F | 10000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 60000 |
Most frequent character per script
| Value | Count | Frequency (%) |
| F | 10000 | |
| r | 10000 | |
| a | 10000 | |
| n | 10000 | |
| c | 10000 | |
| e | 10000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 60000 |
Most frequent character per block
| Value | Count | Frequency (%) |
| F | 10000 | |
| r | 10000 | |
| a | 10000 | |
| n | 10000 | |
| c | 10000 | |
| e | 10000 |
ingredients_text
| Distinct | 7791 |
|---|---|
| Distinct (%) | 98.5% |
| Missing | 2094 |
| Missing (%) | 20.9% |
| Memory size | 78.2 KiB |
| Semoule de _blé_ dur de qualité supérieure. | 16 |
|---|---|
| Semoule de blé dur de qualité supérieure. | 9 |
| Haricots verts, eau, sel. | 7 |
| 100 % semoule de _blé_ dur de qualité supérieure. | 6 |
| Lait demi-écrémé stérilisé UHT. | 4 |
| Other values (7786) |
Length
| Max length | 3856 |
|---|---|
| Median length | 201 |
| Mean length | 270.1463445 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2135777 |
|---|---|
| Distinct characters | 160 |
| Distinct categories | 18 |
| Distinct scripts | 2 |
| Distinct blocks | 6 |
Unique
| Unique | 7722 |
|---|---|
| Unique (%) | 97.7% |
Sample
| 1st row | Composition / Samenstellin OF$is.sarteatx 0,15%1 sucre, acide cMqUe, antioxydant acide ()GeanJnatFeerde Frai " m*des. gpnncentrattl), tropical smaak. Ingred'féflten. Ilit saponcentraten 12% (ananas 5306'sinaasappel 45% passievycht 0,6% ab(ikoos 0,306 guave 03, mandari!/or mi?eZUIF, stabi!lütbr. pétiœs |
|---|---|
| 2nd row | Semoule de _blé_ dur précuite (_gluten_) 42%, farine de lentilles 23%, flocons de _soja_ 8%, pois cassés précuits déshydratés 7%, lentilles entières précuites déshydratées 6,5 %, pépites de _lupin_ 5,5%, flocons d'_orge_ (_gluten_), sel, huile de tournesol, carottes déshydratées, arômes. |
| 3rd row | Racines de raifort, eau, huile de tournesol, amidon odifié, acidifiants : acétate de sodium, te vinaigre, sel, amidon m citique,épaississant comme xanthane, arômes naturels, conserra sorbate de disulfite de sodium, colorant : dioxyde Milder Meerrettich aus dem Elsass. Zutaten: -Meerrettichwurzeln,wax Sonnenblumenôl, Weizenstârke, Essig, Salz, modifizierte Sâuerungsmittel: Natriumacetat, Zitronensâure, Verdickungsm± Xanthan, natürliche Aromen, Konservierungsstoffe: Kalium* Natriummetabisulfit, Farbstoff: Titandicxid. Mild h?seradish from Alsace. IngredienÉ: hN3erafflsh roob, modifieds gum,naturai sorbate,sodium metabisu/flte, soentztitanium bis: siehe Nach dem ôffnen kühPlagerFifBèétt*re : see date on opened keep III I I I II I |
| 4th row | 100% pure viande de bœuf. |
| 5th row | 100 % semoule de _blé_ dur* de qualité supérieure. *issu de la filière Alpina Savoie. |
| Value | Count | Frequency (%) |
| Semoule de _blé_ dur de qualité supérieure. | 16 | 0.2% |
| Semoule de blé dur de qualité supérieure. | 9 | 0.1% |
| Haricots verts, eau, sel. | 7 | 0.1% |
| 100 % semoule de _blé_ dur de qualité supérieure. | 6 | 0.1% |
| Lait demi-écrémé stérilisé UHT. | 4 | < 0.1% |
| Jus d'orange | 4 | < 0.1% |
| Jus de pomme. | 4 | < 0.1% |
| 100 % semoule de blé dur de qualité supérieure. | 3 | < 0.1% |
| Miel | 3 | < 0.1% |
| Riz | 3 | < 0.1% |
| Other values (7781) | 7847 | |
| (Missing) | 2094 | 20.9% |
| Value | Count | Frequency (%) |
| de | 35392 | 10.9% |
| (blank) | 22107 | 6.8% |
| sel | 6315 | 1.9% |
| sucre | 4751 | 1.5% |
| lait | 4709 | 1.5% |
| eau | 4207 | 1.3% |
| et | 3802 | 1.2% |
| blé | 3717 | 1.1% |
| huile | 3214 | 1.0% |
| poudre | 3180 | 1.0% |
| Other values (24249) | 233140 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 318000 | 14.9% |
| e | 207074 | 9.7% |
| a | 125972 | 5.9% |
| r | 112677 | 5.3% |
| i | 111623 | 5.2% |
| s | 101589 | 4.8% |
| t | 97321 | 4.6% |
| o | 94378 | 4.4% |
| n | 87509 | 4.1% |
| , | 79142 | 3.7% |
| Other values (150) | 800492 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1485992 | |
| Space Separator | 318081 | 14.9% |
| Other Punctuation | 137157 | 6.4% |
| Uppercase Letter | 78182 | 3.7% |
| Decimal Number | 55642 | 2.6% |
| Connector Punctuation | 20471 | 1.0% |
| Open Punctuation | 16322 | 0.8% |
| Close Punctuation | 15984 | 0.7% |
| Dash Punctuation | 7183 | 0.3% |
| Final Punctuation | 258 | < 0.1% |
| Other values (8) | 505 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 207074 | |
| a | 125972 | 8.5% |
| r | 112677 | 7.6% |
| i | 111623 | 7.5% |
| s | 101589 | 6.8% |
| t | 97321 | 6.5% |
| o | 94378 | 6.4% |
| n | 87509 | 5.9% |
| u | 74507 | 5.0% |
| l | 74503 | 5.0% |
| Other values (40) | 398839 |
| Value | Count | Frequency (%) |
| E | 11580 | |
| A | 5802 | 7.4% |
| S | 5747 | 7.4% |
| I | 5409 | 6.9% |
| C | 4504 | 5.8% |
| O | 4112 | 5.3% |
| T | 4022 | 5.1% |
| R | 3902 | 5.0% |
| P | 3818 | 4.9% |
| L | 3672 | 4.7% |
| Other values (33) | 25614 |
| Value | Count | Frequency (%) |
| , | 79142 | |
| % | 14790 | 10.8% |
| . | 12995 | 9.5% |
| : | 12868 | 9.4% |
| ' | 6444 | 4.7% |
| * | 4304 | 3.1% |
| ; | 2515 | 1.8% |
| / | 2109 | 1.5% |
| • | 678 | 0.5% |
| ? | 517 | 0.4% |
| Other values (7) | 795 | 0.6% |
| Value | Count | Frequency (%) |
| 1 | 9562 | |
| 0 | 9224 | |
| 2 | 7299 | |
| 5 | 6101 | |
| 3 | 5611 | |
| 4 | 5120 | |
| 6 | 4020 | |
| 7 | 3375 | 6.1% |
| 8 | 2729 | 4.9% |
| 9 | 2601 | 4.7% |
| Value | Count | Frequency (%) |
| + | 165 | |
| = | 49 | 19.1% |
| ± | 30 | 11.7% |
| | | 7 | 2.7% |
| < | 4 | 1.6% |
| ~ | 1 | 0.4% |
| Value | Count | Frequency (%) |
| ( | 15631 | |
| [ | 608 | 3.7% |
| { | 71 | 0.4% |
| „ | 12 | 0.1% |
| Value | Count | Frequency (%) |
| ` | 5 | |
| ´ | 2 | 22.2% |
| ¨ | 1 | 11.1% |
| ^ | 1 | 11.1% |
| Value | Count | Frequency (%) |
| ) | 15367 | |
| ] | 521 | 3.3% |
| } | 96 | 0.6% |
| Value | Count | Frequency (%) |
| - | 7019 | |
| — | 148 | 2.1% |
| – | 16 | 0.2% |
| Value | Count | Frequency (%) |
| ‘ | 53 | |
| « | 24 | |
| “ | 19 | 19.8% |
| Value | Count | Frequency (%) |
| ’ | 228 | |
| » | 28 | 10.9% |
| ” | 2 | 0.8% |
| Value | Count | Frequency (%) |
| ® | 12 | |
| ° | 11 | |
| ™ | 1 | 4.2% |
| Value | Count | Frequency (%) |
| | 1 | |
| | 1 | |
| | 1 |
| Value | Count | Frequency (%) |
| (space) | 318000 | |
| (other space separator) | 81 | < 0.1% |
| Value | Count | Frequency (%) |
| $ | 85 | |
| € | 26 | 23.4% |
| Value | Count | Frequency (%) |
| | 1 | |
| | 1 |
| Value | Count | Frequency (%) |
| _ | 20471 |
| Value | Count | Frequency (%) |
| ¹ | 4 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1564174 | |
| Common | 571603 | 26.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 207074 | |
| a | 125972 | 8.1% |
| r | 112677 | 7.2% |
| i | 111623 | 7.1% |
| s | 101589 | 6.5% |
| t | 97321 | 6.2% |
| o | 94378 | 6.0% |
| n | 87509 | 5.6% |
| u | 74507 | 4.8% |
| l | 74503 | 4.8% |
| Other values (83) | 477021 |
| Value | Count | Frequency (%) |
| (space) | 318000 | |
| , | 79142 | 13.8% |
| _ | 20471 | 3.6% |
| ( | 15631 | 2.7% |
| ) | 15367 | 2.7% |
| % | 14790 | 2.6% |
| . | 12995 | 2.3% |
| : | 12868 | 2.3% |
| 1 | 9562 | 1.7% |
| 0 | 9224 | 1.6% |
| Other values (57) | 63553 | 11.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2080199 | |
| None | 54299 | 2.5% |
| Punctuation | 1157 | 0.1% |
| Alphabetic PF | 95 | < 0.1% |
| Currency Symbols | 26 | < 0.1% |
| Letterlike Symbols | 1 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| (space) | 318000 | |
| e | 207074 | 10.0% |
| a | 125972 | 6.1% |
| r | 112677 | 5.4% |
| i | 111623 | 5.4% |
| s | 101589 | 4.9% |
| t | 97321 | 4.7% |
| o | 94378 | 4.5% |
| n | 87509 | 4.2% |
| , | 79142 | 3.8% |
| Other values (83) | 744914 |
| Value | Count | Frequency (%) |
| é | 37094 | |
| ô | 4201 | 7.7% |
| è | 3061 | 5.6% |
| à | 2468 | 4.5% |
| â | 1496 | 2.8% |
| ï | 1202 | 2.2% |
| œ | 940 | 1.7% |
| É | 826 | 1.5% |
| ê | 396 | 0.7% |
| ü | 395 | 0.7% |
| Other values (44) | 2220 | 4.1% |
| Value | Count | Frequency (%) |
| • | 678 | |
| ’ | 228 | 19.7% |
| — | 148 | 12.8% |
| ‘ | 53 | 4.6% |
| “ | 19 | 1.6% |
| – | 16 | 1.4% |
| „ | 12 | 1.0% |
| ” | 2 | 0.2% |
| ‰ | 1 | 0.1% |
| Value | Count | Frequency (%) |
| € | 26 |
| Value | Count | Frequency (%) |
| fi | 72 | |
| fl | 23 | 24.2% |
| Value | Count | Frequency (%) |
| ™ | 1 |
nutrition_grade_fr
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 396 |
| Missing (%) | 4.0% |
| Memory size | 78.2 KiB |
| d | |
|---|---|
| c | |
| e | |
| a | |
| b |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 9604 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | e |
|---|---|
| 2nd row | a |
| 3rd row | d |
| 4th row | d |
| 5th row | a |
| Value | Count | Frequency (%) |
| d | 2659 | |
| c | 2076 | |
| e | 1939 | |
| a | 1525 | |
| b | 1405 | |
| (Missing) | 396 | 4.0% |
| Value | Count | Frequency (%) |
| d | 2659 | |
| c | 2076 | |
| e | 1939 | |
| a | 1525 | |
| b | 1405 |
Most occurring characters
| Value | Count | Frequency (%) |
| d | 2659 | |
| c | 2076 | |
| e | 1939 | |
| a | 1525 | |
| b | 1405 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 9604 |
Most frequent character per category
| Value | Count | Frequency (%) |
| d | 2659 | |
| c | 2076 | |
| e | 1939 | |
| a | 1525 | |
| b | 1405 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9604 |
Most frequent character per script
| Value | Count | Frequency (%) |
| d | 2659 | |
| c | 2076 | |
| e | 1939 | |
| a | 1525 | |
| b | 1405 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9604 |
Most frequent character per block
| Value | Count | Frequency (%) |
| d | 2659 | |
| c | 2076 | |
| e | 1939 | |
| a | 1525 | |
| b | 1405 |
energy_100g
| Distinct | 2245 |
|---|---|
| Distinct (%) | 22.6% |
| Missing | 57 |
| Missing (%) | 0.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1115.18276 |
| Minimum | 0 |
|---|---|
| Maximum | 3766 |
| Zeros | 79 |
| Zeros (%) | 0.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 113 |
| Q1 | 431 |
| median | 1040 |
| Q3 | 1644 |
| 95-th percentile | 2377 |
| Maximum | 3766 |
| Range | 3766 |
| Interquartile range (IQR) | 1213 |
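For readers who want to reproduce a quantile table like this one, the sketch below computes the quartiles and IQR with the standard library. It assumes linear interpolation between order statistics (the `"inclusive"` method of `statistics.quantiles`, which matches the default behaviour of most data-frame libraries) and that missing values have already been removed. The helper name is hypothetical.

```python
from statistics import quantiles


def quartile_summary(values):
    """Return Q1, median, Q3 and IQR using linear interpolation."""
    q1, q2, q3 = quantiles(sorted(values), n=4, method="inclusive")
    return {"Q1": q1, "median": q2, "Q3": q3, "IQR": q3 - q1}


print(quartile_summary([1, 2, 3, 4]))
```

Applied to the reported figures, the same relation holds: Q3 − Q1 = 1644 − 431 = 1213, the IQR shown in the table.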
Descriptive statistics
| Standard deviation | 779.9659188 |
|---|---|
| Coefficient of variation (CV) | 0.6994063637 |
| Kurtosis | -0.06547855236 |
| Mean | 1115.18276 |
| Median Absolute Deviation (MAD) | 608 |
| Skewness | 0.6126035923 |
| Sum | 11088262.18 |
| Variance | 608346.8345 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 79 | 0.8% |
| 192 | 47 | 0.5% |
| 201 | 33 | 0.3% |
| 4 | 33 | 0.3% |
| 3700 | 33 | 0.3% |
| 180 | 32 | 0.3% |
| 176 | 29 | 0.3% |
| 184 | 28 | 0.3% |
| 3766 | 27 | 0.3% |
| 209 | 25 | 0.2% |
| Other values (2235) | 9577 | |
| (Missing) | 57 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 79 | |
| 1 | 4 | < 0.1% |
| 1.4 | 3 | < 0.1% |
| 2 | 2 | < 0.1% |
| 3 | 6 | 0.1% |
| Value | Count | Frequency (%) |
| 3766 | 27 | |
| 3761 | 3 | < 0.1% |
| 3757 | 3 | < 0.1% |
| 3703 | 1 | < 0.1% |
| 3700 | 33 |
fat_100g
| Distinct | 674 |
|---|---|
| Distinct (%) | 8.1% |
| Missing | 1727 |
| Missing (%) | 17.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.46802738 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 729 |
| Zeros (%) | 7.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1.1 |
| median | 6.9 |
| Q3 | 21 |
| 95-th percentile | 44.66 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 19.9 |
Descriptive statistics
| Standard deviation | 17.51559611 |
|---|---|
| Coefficient of variation (CV) | 1.300531668 |
| Kurtosis | 6.924325493 |
| Mean | 13.46802738 |
| Median Absolute Deviation (MAD) | 6.7 |
| Skewness | 2.328063456 |
| Sum | 111420.9905 |
| Variance | 306.7961072 |
| Monotonicity | Not monotonic |
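The percentages for this variable do not all share a denominator. Judging from the reported figures, Missing (%) and Zeros (%) are relative to all 10000 rows, while Distinct (%) and the mean are relative to the non-missing rows only. The check below makes that assumption explicit; the variable names are illustrative.

```python
import math

# Values copied from the tables for this variable.
n_rows = 10_000
missing = 1727
zeros = 729
distinct = 674
total = 111420.9905  # the "Sum" row

non_missing = n_rows - missing

missing_pct = 100 * missing / n_rows          # reported: 17.3%
zeros_pct = 100 * zeros / n_rows              # reported: 7.3%
distinct_pct = 100 * distinct / non_missing   # reported: 8.1%
mean = total / non_missing                    # reported: 13.46802738

print(f"{missing_pct:.1f}% {zeros_pct:.1f}% {distinct_pct:.1f}%")
print(math.isclose(mean, 13.46802738, rel_tol=1e-6))
```

Keeping the denominators straight matters when comparing variables: a column with many missing values can show a high Distinct (%) even when it has few distinct values in absolute terms.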
| Value | Count | Frequency (%) |
| 0 | 729 | 7.3% |
| 0.5 | 332 | 3.3% |
| 0.1 | 245 | 2.5% |
| 0.2 | 184 | 1.8% |
| 1 | 108 | 1.1% |
| 2 | 105 | 1.1% |
| 23 | 104 | 1.0% |
| 11 | 97 | 1.0% |
| 15 | 91 | 0.9% |
| 12 | 82 | 0.8% |
| Other values (664) | 6196 | |
| (Missing) | 1727 | 17.3% |
| Value | Count | Frequency (%) |
| 0 | 729 | |
| 0.001 | 2 | < 0.1% |
| 0.01 | 12 | 0.1% |
| 0.015 | 1 | < 0.1% |
| 0.02 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 59 | |
| 99.9 | 3 | < 0.1% |
| 99.8 | 4 | < 0.1% |
| 93.3 | 1 | < 0.1% |
| 92.7 | 1 | < 0.1% |
saturated-fat_100g
| Distinct | 559 |
|---|---|
| Distinct (%) | 5.7% |
| Missing | 248 |
| Missing (%) | 2.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.230006527 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 1304 |
| Zeros (%) | 13.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.3 |
| median | 1.9 |
| Q3 | 7 |
| 95-th percentile | 20 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 6.7 |
Descriptive statistics
| Standard deviation | 8.305038471 |
|---|---|
| Coefficient of variation (CV) | 1.587959485 |
| Kurtosis | 20.92941021 |
| Mean | 5.230006527 |
| Median Absolute Deviation (MAD) | 1.89 |
| Skewness | 3.590499788 |
| Sum | 51003.02365 |
| Variance | 68.973664 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1304 | 13.0% |
| 0.1 | 565 | 5.7% |
| 0.2 | 291 | 2.9% |
| 0.5 | 269 | 2.7% |
| 0.3 | 243 | 2.4% |
| 0.4 | 229 | 2.3% |
| 1 | 217 | 2.2% |
| 0.6 | 177 | 1.8% |
| 0.8 | 159 | 1.6% |
| 0.7 | 143 | 1.4% |
| Other values (549) | 6155 | |
| (Missing) | 248 | 2.5% |
| Value | Count | Frequency (%) |
| 0 | 1304 | |
| 0.0001 | 2 | < 0.1% |
| 0.001 | 6 | 0.1% |
| 0.003 | 1 | < 0.1% |
| 0.004 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 1 | |
| 95 | 1 | |
| 91 | 1 | |
| 90 | 1 | |
| 87 | 1 |
carbohydrates_100g
| Distinct | 949 |
|---|---|
| Distinct (%) | 11.6% |
| Missing | 1784 |
| Missing (%) | 17.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.44811332 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 413 |
| Zeros (%) | 4.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 4 |
| median | 14.5 |
| Q3 | 52.825 |
| 95-th percentile | 76.35 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 48.825 |
Descriptive statistics
| Standard deviation | 27.25703737 |
|---|---|
| Coefficient of variation (CV) | 0.9930386493 |
| Kurtosis | -0.8944571594 |
| Mean | 27.44811332 |
| Median Absolute Deviation (MAD) | 13.6 |
| Skewness | 0.7219970518 |
| Sum | 225513.699 |
| Variance | 742.9460864 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 413 | 4.1% |
| 0.5 | 229 | 2.3% |
| 1 | 175 | 1.8% |
| 11 | 113 | 1.1% |
| 12 | 102 | 1.0% |
| 2 | 73 | 0.7% |
| 0.6 | 71 | 0.7% |
| 0.8 | 70 | 0.7% |
| 15 | 68 | 0.7% |
| 13 | 67 | 0.7% |
| Other values (939) | 6835 | |
| (Missing) | 1784 | 17.8% |
| Value | Count | Frequency (%) |
| 0 | 413 | 4.1% |
| 0.001 | 1 | < 0.1% |
| 0.01 | 1 | < 0.1% |
| 0.014 | 1 | < 0.1% |
| 0.02 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 4 | < 0.1% |
| 99.9 | 1 | < 0.1% |
| 99.7 | 1 | < 0.1% |
| 99.6 | 2 | < 0.1% |
| 99.5 | 1 | < 0.1% |
| Distinct | 836 |
|---|---|
| Distinct (%) | 8.6% |
| Missing | 245 |
| Missing (%) | 2.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.33380468 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 722 |
| Zeros (%) | 7.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 4 |
| Q3 | 17.9 |
| 95-th percentile | 56.7 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 16.9 |
Descriptive statistics
| Standard deviation | 19.12643104 |
|---|---|
| Coefficient of variation (CV) | 1.434431619 |
| Kurtosis | 3.331111484 |
| Mean | 13.33380468 |
| Median Absolute Deviation (MAD) | 3.73 |
| Skewness | 1.922233602 |
| Sum | 130071.2647 |
| Variance | 365.8203644 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 722 | 7.2% |
| 0.5 | 491 | 4.9% |
| 1 | 277 | 2.8% |
| 2 | 181 | 1.8% |
| 0.7 | 165 | 1.7% |
| 0.6 | 163 | 1.6% |
| 3 | 149 | 1.5% |
| 0.1 | 139 | 1.4% |
| 0.8 | 132 | 1.3% |
| 0.9 | 117 | 1.2% |
| Other values (826) | 7219 | 72.2% |
| (Missing) | 245 | 2.5% |
| Value | Count | Frequency (%) |
| 0 | 722 | 7.2% |
| 0.0001 | 1 | < 0.1% |
| 0.001 | 6 | 0.1% |
| 0.0019 | 1 | < 0.1% |
| 0.01 | 9 | 0.1% |
| Value | Count | Frequency (%) |
| 100 | 4 | < 0.1% |
| 99.9 | 1 | < 0.1% |
| 99.6 | 1 | < 0.1% |
| 99.5 | 3 | < 0.1% |
| 99.3 | 1 | < 0.1% |
| Distinct | 317 |
|---|---|
| Distinct (%) | 4.8% |
| Missing | 3368 |
| Missing (%) | 33.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.719416435 |
| Minimum | 0 |
|---|---|
| Maximum | 99 |
| Zeros | 1943 |
| Zeros (%) | 19.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1.4 |
| Q3 | 3.3 |
| 95-th percentile | 9.7 |
| Maximum | 99 |
| Range | 99 |
| Interquartile range (IQR) | 3.3 |
Descriptive statistics
| Standard deviation | 5.161008726 |
|---|---|
| Coefficient of variation (CV) | 1.897836851 |
| Kurtosis | 91.55494292 |
| Mean | 2.719416435 |
| Median Absolute Deviation (MAD) | 1.4 |
| Skewness | 7.436731631 |
| Sum | 18035.1698 |
| Variance | 26.63601107 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1943 | 19.4% |
| 0.5 | 263 | 2.6% |
| 1 | 186 | 1.9% |
| 3 | 170 | 1.7% |
| 2 | 151 | 1.5% |
| 1.5 | 134 | 1.3% |
| 0.1 | 122 | 1.2% |
| 2.5 | 106 | 1.1% |
| 1.9 | 97 | 1.0% |
| 0.9 | 94 | 0.9% |
| Other values (307) | 3366 | 33.7% |
| (Missing) | 3368 | 33.7% |
| Value | Count | Frequency (%) |
| 0 | 1943 | 19.4% |
| 0.0001 | 1 | < 0.1% |
| 0.0007 | 1 | < 0.1% |
| 0.001 | 7 | 0.1% |
| 0.002 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 99 | 1 | < 0.1% |
| 98 | 1 | < 0.1% |
| 89 | 1 | < 0.1% |
| 80 | 1 | < 0.1% |
| 76.25 | 1 | < 0.1% |
| Distinct | 520 |
|---|---|
| Distinct (%) | 5.2% |
| Missing | 80 |
| Missing (%) | 0.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.781294667 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 662 |
| Zeros (%) | 6.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1.59 |
| median | 5.9 |
| Q3 | 11 |
| 95-th percentile | 24 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 9.41 |
Descriptive statistics
| Standard deviation | 8.158415412 |
|---|---|
| Coefficient of variation (CV) | 1.048465038 |
| Kurtosis | 12.93757572 |
| Mean | 7.781294667 |
| Median Absolute Deviation (MAD) | 4.5 |
| Skewness | 2.436764372 |
| Sum | 77190.4431 |
| Variance | 66.55974203 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 662 | 6.6% |
| 0.5 | 391 | 3.9% |
| 12 | 160 | 1.6% |
| 0.7 | 160 | 1.6% |
| 13 | 155 | 1.6% |
| 6 | 143 | 1.4% |
| 0.1 | 137 | 1.4% |
| 0.6 | 136 | 1.4% |
| 11 | 129 | 1.3% |
| 1 | 126 | 1.3% |
| Other values (510) | 7721 | 77.2% |
| (Missing) | 80 | 0.8% |
| Value | Count | Frequency (%) |
| 0 | 662 | 6.6% |
| 0.0001 | 1 | < 0.1% |
| 0.001 | 3 | < 0.1% |
| 0.01 | 15 | 0.1% |
| 0.02 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 1 | < 0.1% |
| 88 | 1 | < 0.1% |
| 87.8 | 1 | < 0.1% |
| 85.2 | 1 | < 0.1% |
| 85 | 2 | < 0.1% |
| Distinct | 987 |
|---|---|
| Distinct (%) | 10.1% |
| Missing | 238 |
| Missing (%) | 2.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.13986321 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 999 |
| Zeros (%) | 10.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.06 |
| median | 0.53745 |
| Q3 | 1.25 |
| 95-th percentile | 3.2 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 1.19 |
Descriptive statistics
| Standard deviation | 3.779008701 |
|---|---|
| Coefficient of variation (CV) | 3.315317723 |
| Kurtosis | 296.299645 |
| Mean | 1.13986321 |
| Median Absolute Deviation (MAD) | 0.51245 |
| Skewness | 15.00190663 |
| Sum | 11127.34465 |
| Variance | 14.28090676 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 999 | 10.0% |
| 0.01 | 427 | 4.3% |
| 0.1 | 396 | 4.0% |
| 1 | 273 | 2.7% |
| 0.02 | 197 | 2.0% |
| 1.3 | 181 | 1.8% |
| 0.03 | 171 | 1.7% |
| 1.1 | 171 | 1.7% |
| 1.5 | 170 | 1.7% |
| 1.2 | 160 | 1.6% |
| Other values (977) | 6617 | 66.2% |
| (Missing) | 238 | 2.4% |
| Value | Count | Frequency (%) |
| 0 | 999 | 10.0% |
| 0.0001 | 2 | < 0.1% |
| 0.00018 | 1 | < 0.1% |
| 0.00025 | 1 | < 0.1% |
| 0.0006858 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 3 | < 0.1% |
| 97.5 | 1 | < 0.1% |
| 80.5 | 1 | < 0.1% |
| 80 | 1 | < 0.1% |
| 61.9 | 1 | < 0.1% |
| Distinct | 48 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 396 |
| Missing (%) | 4.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.833402749 |
| Minimum | -13 |
|---|---|
| Maximum | 35 |
| Zeros | 500 |
| Zeros (%) | 5.0% |
| Negative | 1529 |
| Negative (%) | 15.3% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | -13 |
|---|---|
| 5-th percentile | -5 |
| Q1 | 1 |
| median | 9 |
| Q3 | 15 |
| 95-th percentile | 24 |
| Maximum | 35 |
| Range | 48 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 8.906700757 |
|---|---|
| Coefficient of variation (CV) | 1.008297823 |
| Kurtosis | -0.946583802 |
| Mean | 8.833402749 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.1644690042 |
| Sum | 84836 |
| Variance | 79.32931838 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 500 | 5.0% |
| 2 | 488 | 4.9% |
| 1 | 469 | 4.7% |
| 14 | 467 | 4.7% |
| 11 | 436 | 4.4% |
| 3 | 433 | 4.3% |
| 13 | 414 | 4.1% |
| 12 | 390 | 3.9% |
| 4 | 373 | 3.7% |
| -1 | 357 | 3.6% |
| Other values (38) | 5277 | 52.8% |
| (Missing) | 396 | 4.0% |
| Value | Count | Frequency (%) |
| -13 | 1 | < 0.1% |
| -12 | 2 | < 0.1% |
| -11 | 5 | 0.1% |
| -10 | 9 | 0.1% |
| -9 | 12 | 0.1% |
| Value | Count | Frequency (%) |
| 35 | 1 | < 0.1% |
| 34 | 2 | < 0.1% |
| 32 | 2 | < 0.1% |
| 31 | 4 | < 0.1% |
| 30 | 6 | 0.1% |
| Distinct | 99 |
|---|---|
| Distinct (%) | 29.1% |
| Missing | 9660 |
| Missing (%) | 96.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37.01079412 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 92 |
| Zeros (%) | 0.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 34.5 |
| Q3 | 60 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 60 |
Descriptive statistics
| Standard deviation | 34.44168914 |
|---|---|
| Coefficient of variation (CV) | 0.9305849809 |
| Kurtosis | -1.021154761 |
| Mean | 37.01079412 |
| Median Absolute Deviation (MAD) | 30.2 |
| Skewness | 0.4965712486 |
| Sum | 12583.67 |
| Variance | 1186.229951 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 92 | 0.9% |
| 50 | 31 | 0.3% |
| 100 | 30 | 0.3% |
| 12 | 12 | 0.1% |
| 60 | 10 | 0.1% |
| 65 | 10 | 0.1% |
| 55 | 6 | 0.1% |
| 99 | 5 | 0.1% |
| 10 | 5 | 0.1% |
| 53 | 4 | < 0.1% |
| Other values (89) | 135 | 1.4% |
| (Missing) | 9660 | 96.6% |
| Value | Count | Frequency (%) |
| 0 | 92 | 0.9% |
| 1.5 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 3.5 | 1 | < 0.1% |
| 5 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 30 | 0.3% |
| 99.9 | 2 | < 0.1% |
| 99.7 | 1 | < 0.1% |
| 99.2 | 1 | < 0.1% |
| 99 | 5 | 0.1% |
| Distinct | 6171 |
|---|---|
| Distinct (%) | 75.4% |
| Missing | 1814 |
| Missing (%) | 18.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1103.202935 |
| Minimum | 0 |
|---|---|
| Maximum | 3968.2 |
| Zeros | 83 |
| Zeros (%) | 0.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 107.1 |
| Q1 | 424.975 |
| median | 1021.4 |
| Q3 | 1631.975 |
| 95-th percentile | 2367 |
| Maximum | 3968.2 |
| Range | 3968.2 |
| Interquartile range (IQR) | 1207 |
Descriptive statistics
| Standard deviation | 780.5005442 |
|---|---|
| Coefficient of variation (CV) | 0.7074859207 |
| Kurtosis | 0.1249170228 |
| Mean | 1103.202935 |
| Median Absolute Deviation (MAD) | 602.95 |
| Skewness | 0.6709130098 |
| Sum | 9030819.226 |
| Variance | 609181.0995 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 83 | 0.8% |
| 3800 | 52 | 0.5% |
| 3496 | 18 | 0.2% |
| 187 | 16 | 0.2% |
| 1521 | 13 | 0.1% |
| 1504 | 12 | 0.1% |
| 196.8 | 12 | 0.1% |
| 204 | 11 | 0.1% |
| 170 | 11 | 0.1% |
| 212.5 | 11 | 0.1% |
| Other values (6161) | 7947 | 79.5% |
| (Missing) | 1814 | 18.1% |
| Value | Count | Frequency (%) |
| 0 | 83 | 0.8% |
| 1.7 | 2 | < 0.1% |
| 3.4 | 7 | 0.1% |
| 4.67 | 1 | < 0.1% |
| 5.1 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 3968.2 | 1 | < 0.1% |
| 3809.4 | 2 | < 0.1% |
| 3801.7 | 1 | < 0.1% |
| 3800.034 | 1 | < 0.1% |
| 3800 | 52 | 0.5% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the slope does not affect r; only its sign does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
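As a minimal sketch of the calculation described above (covariance divided by the product of the standard deviations), the following uses plain Python. The function name `pearson_r` and the toy data are illustrative assumptions, not part of the report or of the library that generated it.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    Hypothetical helper for illustration; assumes equal-length inputs with
    non-zero variance (a constant input would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data (assumed): a shifted and rescaled copy of x, and a sign-flipped copy.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_scaled = [10.0 + 3.0 * v for v in x]   # location and scale change
y_reversed = [-2.0 * v for v in x]       # negative slope

print(pearson_r(x, y_scaled))
print(pearson_r(x, y_reversed))
```

The shifted and rescaled series should correlate perfectly with `x`, which illustrates the invariance under location and scale; flipping the sign of the slope flips the sign of r.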
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
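The rank-based calculation can be sketched as follows: convert each variable to ranks, then apply the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`), the choice of average ranks for ties, and the toy data are assumptions made for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (assumed): a monotonic but nonlinear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cubic = [v ** 3 for v in x]

print(spearman_rho(x, y_cubic))  # monotonic, so the ranks agree exactly
print(pearson_r(x, y_cubic))     # strictly below 1 because the relation is not linear
```

The cubic example shows why ρ is preferred for nonlinear monotonic relationships: the ranks of `x` and `y_cubic` are identical even though the raw values do not lie on a line.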
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
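The pair-counting definition above can be sketched directly. This follows the formula exactly as stated in the text (concordant minus discordant, over all pairs); tie-corrected variants exist and a given library may use one of them, so treat the function name `kendall_tau` and the toy data as illustrative assumptions.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text.

    Hypothetical helper; tied pairs count as neither concordant nor discordant.
    """
    pairs = list(combinations(range(len(x)), 2))
    concordant = 0
    discordant = 0
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    return (concordant - discordant) / len(pairs)


# Toy data (assumed): one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

print(kendall_tau(x, y))
```

With four observations there are six pairs; five are concordant and one (the second and third observations) is discordant, so τ = (5 - 1) / 6.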
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | Unnamed: 0 | product_name | brands | countries | ingredients_text | nutrition_grade_fr | energy_100g | fat_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | nutrition-score-fr_100g | fruits-vegetables-nuts_100g | calculated_energy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 56740 | 258083 | Easy fruity | Carrefour Kids | France | Composition / Samenstellin OF$is.sarteatx 0,15%1 sucre, acide cMqUe, antioxydant acide ()GeanJnatFeerde Frai " m*des. gpnncentrattl), tropical smaak. Ingred'féflten. Ilit saponcentraten 12% (ananas 5306'sinaasappel 45% passievycht 0,6% ab(ikoos 0,306 guave 03, mandari!/or mi?eZUIF, stabi!lütbr. pétiœs | e | 146.0 | 0.0 | 0.0 | 8.7 | 8.7 | 0.0 | 0.0 | 0.000 | 11.0 | NaN | 147.9 |
| 1 | 64476 | 267489 | Légumes Secs Gourmands | Tipiak | France | Semoule de _blé_ dur précuite (_gluten_) 42%, farine de lentilles 23%, flocons de _soja_ 8%, pois cassés précuits déshydratés 7%, lentilles entières précuites déshydratées 6,5 %, pépites de _lupin_ 5,5%, flocons d'_orge_ (_gluten_), sel, huile de tournesol, carottes déshydratées, arômes. | a | 1490.0 | 4.1 | 0.6 | 54.0 | 4.5 | NaN | 20.0 | 1.300 | -1.0 | NaN | 1413.8 |
| 2 | 39562 | 233689 | Raifort doux d'Alsace | Alélor | France | Racines de raifort, eau, huile de tournesol, amidon odifié, acidifiants : acétate de sodium, te vinaigre, sel, amidon m citique,épaississant comme xanthane, arômes naturels, conserra sorbate de disulfite de sodium, colorant : dioxyde Milder Meerrettich aus dem Elsass. Zutaten: -Meerrettichwurzeln,wax Sonnenblumenôl, Weizenstârke, Essig, Salz, modifizierte Sâuerungsmittel: Natriumacetat, Zitronensâure, Verdickungsm± Xanthan, natürliche Aromen, Konservierungsstoffe: Kalium* Natriummetabisulfit, Farbstoff: Titandicxid. Mild h?seradish from Alsace. IngredienÉ: hN3erafflsh roob, modifieds gum,naturai sorbate,sodium metabisu/flte, soentztitanium bis: siehe Nach dem ôffnen kühPlagerFifBèétt*re : see date on opened keep III I I I II I | d | 1172.0 | 24.3 | 2.6 | 18.0 | 2.8 | NaN | 1.2 | 1.500 | 11.0 | NaN | 1249.8 |
| 3 | 66314 | 270026 | Instant choco | Naturella | France | NaN | d | 1552.0 | NaN | 2.3 | NaN | 65.7 | 13.2 | 6.9 | 0.050 | 11.0 | NaN | NaN |
| 4 | 20256 | 209466 | Le Pur Bœuf 5% de M.G. | Charal | France | 100% pure viande de bœuf. | a | 534.0 | 5.0 | 2.1 | 0.0 | 0.0 | NaN | 20.5 | 0.180 | -2.0 | NaN | 538.5 |
| 5 | 76832 | 302765 | Corn Flakes | Carrefour | France | NaN | c | 1552.0 | NaN | 0.2 | NaN | 4.0 | 5.0 | 8.0 | 1.800 | 6.0 | NaN | NaN |
| 6 | 26498 | 217483 | Pâtes de Savoie, Frisettes | Alpina Savoie,Alpina | France | 100 % semoule de _blé_ dur* de qualité supérieure. *issu de la filière Alpina Savoie. | a | 1526.0 | 2.0 | 0.3 | 72.0 | 2.0 | 3.0 | 12.0 | 0.010 | -5.0 | NaN | 1504.0 |
| 7 | 15658 | 203380 | Gâche tranchée aux éclats de chocolat Pur beurre, 1 pièce | Auchan | France | NaN | e | 1736.0 | NaN | 12.0 | NaN | 22.0 | 0.0 | 8.3 | 1.000 | 23.0 | NaN | NaN |
| 8 | 30186 | 221955 | Œufs Calibre Moyen | Cora | France | _Œufs_ de poules élevées en cage. | a | 607.0 | 10.3 | 2.7 | 0.7 | 0.7 | 0.5 | 12.3 | 0.320 | -1.0 | NaN | 612.4 |
| 9 | 90578 | 352290 | Aachi Lime Pickle 300Gm | Aachi | France | lime en morceaux, huile végétale raffinée, sel, piment en poudre, moutarde, fénugrec en poudret ase-fétide et régulateur de l'acidité E260. Allergen Advice?. Contain Mustard. Klay Contain TreeNts, Peanutst G!uten, Soya, Sesame, Milk Products and Sulpt'ilte. CONTAY PERMITTED CLASS 11 PRESERVATIVE ssat Lic. No. 2415023000259 OPENING IF BOTTLE ts PUFFED 1 LEAKING ORY spooN | b | 628.0 | 7.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.3 | 0.586 | 2.0 | NaN | 288.1 |
Last rows
| | df_index | Unnamed: 0 | product_name | brands | countries | ingredients_text | nutrition_grade_fr | energy_100g | fat_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | nutrition-score-fr_100g | fruits-vegetables-nuts_100g | calculated_energy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9990 | 86439 | 338456 | Sauce tomate cuisinée Parmesan | Heinz | France | Tomates (157g de tomates pour 100g de sauce), huile de tournesol, sucre, amidon modifié, parmesan 1% (lait), sel, arôme, extraits d'ail et d'oignons. | c | 323.0 | 3.6 | 0.6 | 9.2 | 7.4 | NaN | 1.5 | 0.8001 | 4.0 | NaN | 318.7 |
| 9991 | 105 | 11413 | Pistachios Salées | Wonderful | France | _Pistaches_ (98,5%), sel (1,5%). | d | 2444.0 | 46.0 | 5.6 | 17.4 | 7.8 | NaN | 21.4 | 1.3000 | 18.0 | NaN | 2407.6 |
| 9992 | 78187 | 305352 | Tendre Gaufre au chocolat belge | Lotus | France | Farine de _blé_, sucre, _œufs_, chocolat 16% (sucre, pâte de cacao, beurre de cacao, graisse butyrique (_lait_), émulsifiant (lécithine de _soja_, E476)), huiles végétales (colza, palme, noix de coco), sirop de glucose-fructose, fibre alimentaire (oligofructose), poudre de _lait_ écrémé, stabilisant (glycérol), sel, poudre à lever (diphosphate disodique, carbonate acide de sodium), arôme. | e | 1830.0 | 21.2 | 7.4 | 54.4 | 33.2 | 3.2 | 6.1 | 0.9600 | 19.0 | NaN | 1834.1 |
| 9993 | 87530 | 343471 | Jambon serrano chiffonade | Joada, Charcuterie Catalane | France | INGREDiENTS: jambon de porc, sel, dextrose, conservateurs (E-252, E-250), antioxydant (E-300). | d | 962.0 | 11.9 | 4.6 | 1.0 | 0.1 | NaN | 29.8 | 5.5000 | 16.0 | NaN | 975.8 |
| 9994 | 57686 | 259180 | Marshmallowsgoût Vanille - | Copains copines | France | Sirop de glucosefructose, sucre, eau, dextrose, gélatine de porc, arôme naturel de vanille, colorant : carmins. | d | 1406.0 | 0.5 | 0.5 | 80.5 | 65.9 | 0.5 | 3.5 | 0.0790 | 14.0 | 0.0 | 1447.0 |
| 9995 | 40012 | 234474 | Fromage frais 20%mg bio nature | Malo | France | NaN | a | 272.0 | NaN | 1.7 | NaN | 4.3 | 0.0 | 5.6 | 0.1000 | -2.0 | NaN | NaN |
| 9996 | 21624 | 211368 | Soda saveur Orange | Grand Jury | France | Eau gazéifiée ; jus de fruits à base de concentrés 12% (orange 10%, citron 2%) ; sucre ; acidifiant : acide citrique : arôme naturel d'orange et autres arômes naturels ; antioxydant : acide ascorbique ; colorant : caroténoïdes (E160a). | e | 134.0 | 0.0 | 0.0 | 8.0 | 8.0 | NaN | 0.0 | 0.0100 | 11.0 | 12.0 | 136.0 |
| 9997 | 9405 | 194171 | Fromage blanc saveur vanille | Activia | France | Fromage blanc (lait), lait entier, eau, crème (lait), sucre (7,4%), sirop de glucose-fructose (0,7%), épaississants : E 1422 (amidon transformé), E 440 (pectine), E 412 (gomme guar), protéines de lait, arôme, gélatine (non porcine), correcteurs d'acidité: E330 (acide citrique), E333 (citrate de calcium), 331 (citrate de sodium), colorants : E 160a (caroténoides), E 101 (riboflavines), ferments lactiques dont bifidobacterium (Bifidus ActiRegularis) (lait), gousses de vanille épuisées. Décor : écorce de vanille | c | 422.0 | 3.8 | 2.7 | 11.8 | 11.3 | 0.1 | 4.7 | 0.0900 | 3.0 | NaN | 424.9 |
| 9998 | 80624 | 311941 | Confiture D'oranges Aicha | Les Conserves De Meknès | France | NaN | d | 1063.0 | NaN | 0.4 | NaN | 63.0 | 1.0 | 0.4 | 0.0200 | 12.0 | NaN | NaN |
| 9999 | 50095 | 248832 | Thon Au Naturel Aro | Aro | France | thon Listao (Katsuwonus pelamis), eau, sel. | b | 414.0 | 0.6 | 0.5 | 0.0 | 0.0 | NaN | 23.5 | 1.2000 | 1.0 | NaN | 422.3 |